Dataset statistics
| Number of variables | 22 |
|---|---|
| Number of observations | 84548 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 14.2 MiB |
| Average record size in memory | 176.0 B |
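As a quick sanity check on the two memory figures above, the average record size should be roughly the total size divided by the number of observations. The sketch below redoes that arithmetic with illustrative variable names. Because the report rounds the total to one decimal place, the result only approximately matches the reported 176.0 B.

```python
# Figures copied from the "Dataset statistics" table.
n_observations = 84548
total_size_mib = 14.2  # rounded to one decimal place in the report

# 1 MiB = 1024 * 1024 bytes.
total_size_bytes = total_size_mib * 1024 ** 2
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```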
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 14 |
Alerts
| Alert | Type |
|---|---|
| EASE-MENT has constant value " " | Constant |
| NEIGHBORHOOD has a high cardinality: 254 distinct values | High cardinality |
| BUILDING CLASS AT PRESENT has a high cardinality: 167 distinct values | High cardinality |
| ADDRESS has a high cardinality: 67563 distinct values | High cardinality |
| APARTMENT NUMBER has a high cardinality: 3989 distinct values | High cardinality |
| LAND SQUARE FEET has a high cardinality: 6062 distinct values | High cardinality |
| GROSS SQUARE FEET has a high cardinality: 5691 distinct values | High cardinality |
| BUILDING CLASS AT TIME OF SALE has a high cardinality: 166 distinct values | High cardinality |
| SALE PRICE has a high cardinality: 10008 distinct values | High cardinality |
| SALE DATE has a high cardinality: 364 distinct values | High cardinality |
| BLOCK is highly overall correlated with Unnamed: 0 and 1 other field | High correlation |
| ZIP CODE is highly overall correlated with BOROUGH and 2 other fields | High correlation |
| RESIDENTIAL UNITS is highly overall correlated with TOTAL UNITS | High correlation |
| TOTAL UNITS is highly overall correlated with RESIDENTIAL UNITS and 1 other field | High correlation |
| BOROUGH is highly overall correlated with Unnamed: 0 and 4 other fields | High correlation |
| BUILDING CLASS CATEGORY is highly overall correlated with BOROUGH and 5 other fields | High correlation |
| TAX CLASS AT PRESENT is highly overall correlated with BOROUGH and 3 other fields | High correlation |
| TAX CLASS AT TIME OF SALE is highly overall correlated with BUILDING CLASS CATEGORY and 1 other field | High correlation |
| Unnamed: 0 is highly overall correlated with BOROUGH and 1 other field | High correlation |
| LOT is highly overall correlated with BUILDING CLASS CATEGORY | High correlation |
| COMMERCIAL UNITS is highly overall correlated with TOTAL UNITS | High correlation |
| YEAR BUILT is highly overall correlated with BUILDING CLASS CATEGORY | High correlation |
| RESIDENTIAL UNITS is highly skewed (γ1 = 60.70273283) | Skewed |
| COMMERCIAL UNITS is highly skewed (γ1 = 214.4011234) | Skewed |
| TOTAL UNITS is highly skewed (γ1 = 63.44833684) | Skewed |
| ZIP CODE has 982 (1.2%) zeros | Zeros |
| RESIDENTIAL UNITS has 24783 (29.3%) zeros | Zeros |
| COMMERCIAL UNITS has 79429 (93.9%) zeros | Zeros |
| TOTAL UNITS has 19762 (23.4%) zeros | Zeros |
| YEAR BUILT has 6970 (8.2%) zeros | Zeros |
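The "Skewed" and "Zeros" entries rest on two simple quantities: the skewness coefficient γ1 and the share of observations equal to zero. The following is a minimal sketch of both, using the plain moment-based definition of skewness. The profiler's own estimator may apply a small-sample correction, so this is an illustration rather than its exact computation. The function names and the toy column are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5.

    Assumes the values are not all identical (variance must be non-zero).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def zero_share(zero_count, total):
    """Percentage of observations equal to zero, rounded to one decimal."""
    return round(100 * zero_count / total, 1)


# Toy column: mostly zeros with one large value, loosely resembling a
# unit-count column where a few buildings dominate.
toy_units = [0, 0, 0, 0, 10]
print(skewness(toy_units))       # positive, i.e. right-skewed
print(zero_share(79429, 84548))  # COMMERCIAL UNITS zero count from the alerts
```

A long right tail pulls γ1 far above zero, which is consistent with unit-count columns where most rows are 0 or 1 and a handful are very large.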
Reproduction
| Analysis started | 2022-12-11 06:35:24.118623 |
|---|---|
| Analysis finished | 2022-12-11 06:35:39.544157 |
| Duration | 15.43 seconds |
| Software version | pandas-profiling v3.5.0 |
| Download configuration | config.json |
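For readers who want to regenerate a report of this kind, the sketch below shows one way to do it with the pandas-profiling package named in the software version row. The CSV path, the report title, and the output filename are hypothetical, since the report does not say which file or settings were used. The function itself is illustrative, not the original author's script.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the result as an HTML report.

    The imports are deferred so that defining this function does not
    require the third-party packages to be installed.
    """
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return html_path


# Hypothetical usage; replace the path with the actual data file:
# build_report("sales.csv")
```

A column called `Unnamed: 0` usually appears when a CSV's unnamed first column is read as data. If that is the case here, passing `index_col=0` to `pd.read_csv` would keep it out of the profile, though that is an assumption about how the file was loaded.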
Unnamed: 0
Real number (ℝ)
| Distinct | 26736 |
|---|---|
| Distinct (%) | 31.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10344.36 |
| Minimum | 4 |
| Maximum | 26739 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 4 |
|---|---|
| 5-th percentile | 849 |
| Q1 | 4231 |
| median | 8942 |
| Q3 | 15987.25 |
| 95-th percentile | 23281 |
| Maximum | 26739 |
| Range | 26735 |
| Interquartile range (IQR) | 11756.25 |
Descriptive statistics
| Standard deviation | 7151.7794 |
|---|---|
| Coefficient of variation (CV) | 0.69136994 |
| Kurtosis | -0.92822006 |
| Mean | 10344.36 |
| Median Absolute Deviation (MAD) | 5586.5 |
| Skewness | 0.44078076 |
| Sum | 8.7459494 × 10⁸ |
| Variance | 51147949 |
| Monotonicity | Not monotonic |
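Several of the figures in the quantile and descriptive tables are derived from one another, which makes them easy to cross-check. The sketch below recomputes the range, the interquartile range, the coefficient of variation, and the variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative. Small differences from the reported values come from rounding in the report.

```python
# Values copied from the "Unnamed: 0" statistics tables.
minimum, maximum = 4, 26739
q1, q3 = 4231, 15987.25
mean = 10344.36
std = 7151.7794

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 6), round(variance))
```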
| Value | Count | Frequency (%) |
| 4 | 5 | < 0.1% |
| 4699 | 5 | < 0.1% |
| 4710 | 5 | < 0.1% |
| 4709 | 5 | < 0.1% |
| 4708 | 5 | < 0.1% |
| 4707 | 5 | < 0.1% |
| 4706 | 5 | < 0.1% |
| 4705 | 5 | < 0.1% |
| 4704 | 5 | < 0.1% |
| 4703 | 5 | < 0.1% |
| Other values (26726) | 84498 |
| Value | Count | Frequency (%) |
| 4 | 5 | |
| 5 | 5 | |
| 6 | 5 | |
| 7 | 5 | |
| 8 | 5 | |
| 9 | 5 | |
| 10 | 5 | |
| 11 | 5 | |
| 12 | 5 | |
| 13 | 5 |
| Value | Count | Frequency (%) |
| 26739 | 1 | |
| 26738 | 1 | |
| 26737 | 1 | |
| 26736 | 1 | |
| 26735 | 1 | |
| 26734 | 1 | |
| 26733 | 1 | |
| 26732 | 1 | |
| 26731 | 1 | |
| 26730 | 1 |
BOROUGH
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| 4 | |
|---|---|
| 3 | |
| 1 | |
| 5 | |
| 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 84548 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 4 | 26736 | |
| 3 | 24047 | |
| 1 | 18306 | |
| 5 | 8410 | 9.9% |
| 2 | 7049 | 8.3% |
Words
| Value | Count | Frequency (%) |
| 4 | 26736 | |
| 3 | 24047 | |
| 1 | 18306 | |
| 5 | 8410 | 9.9% |
| 2 | 7049 | 8.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 26736 | |
| 3 | 24047 | |
| 1 | 18306 | |
| 5 | 8410 | 9.9% |
| 2 | 7049 | 8.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 84548 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 26736 | |
| 3 | 24047 | |
| 1 | 18306 | |
| 5 | 8410 | 9.9% |
| 2 | 7049 | 8.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 84548 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 4 | 26736 | |
| 3 | 24047 | |
| 1 | 18306 | |
| 5 | 8410 | 9.9% |
| 2 | 7049 | 8.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 84548 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4 | 26736 | |
| 3 | 24047 | |
| 1 | 18306 | |
| 5 | 8410 | 9.9% |
| 2 | 7049 | 8.3% |
NEIGHBORHOOD
Categorical
| Distinct | 254 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| FLUSHING-NORTH | 3068 |
|---|---|
| UPPER EAST SIDE (59-79) | 1736 |
| UPPER EAST SIDE (79-96) | 1590 |
| UPPER WEST SIDE (59-79) | 1439 |
| BEDFORD STUYVESANT | 1436 |
| Other values (249) |
Length
| Max length | 25 |
|---|---|
| Median length | 20 |
| Mean length | 13.144983 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1111382 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | ALPHABET CITY |
|---|---|
| 2nd row | ALPHABET CITY |
| 3rd row | ALPHABET CITY |
| 4th row | ALPHABET CITY |
| 5th row | ALPHABET CITY |
Common Values
| Value | Count | Frequency (%) |
| FLUSHING-NORTH | 3068 | 3.6% |
| UPPER EAST SIDE (59-79) | 1736 | 2.1% |
| UPPER EAST SIDE (79-96) | 1590 | 1.9% |
| UPPER WEST SIDE (59-79) | 1439 | 1.7% |
| BEDFORD STUYVESANT | 1436 | 1.7% |
| MIDTOWN EAST | 1418 | 1.7% |
| BOROUGH PARK | 1245 | 1.5% |
| ASTORIA | 1216 | 1.4% |
| BAYSIDE | 1150 | 1.4% |
| FOREST HILLS | 1069 | 1.3% |
| Other values (244) | 69181 |
Words
| Value | Count | Frequency (%) |
| east | 6664 | 4.4% |
| side | 6484 | 4.3% |
| upper | 6471 | 4.3% |
| park | 6273 | 4.1% |
| heights | 4268 | 2.8% |
| west | 4034 | 2.7% |
| 59-79 | 3175 | 2.1% |
| flushing-north | 3068 | 2.0% |
| hill | 2695 | 1.8% |
| bay | 2646 | 1.7% |
| Other values (285) | 106120 |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 104344 | 9.4% |
| A | 80142 | 7.2% |
| S | 79590 | 7.2% |
| R | 72558 | 6.5% |
| " " | 67350 | 6.1% |
| I | 64910 | 5.8% |
| O | 62467 | 5.6% |
| T | 62184 | 5.6% |
| L | 61705 | 5.6% |
| N | 60529 | 5.4% |
| Other values (28) | 395603 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 983670 | |
| Space Separator | 67350 | 6.1% |
| Decimal Number | 25141 | 2.3% |
| Dash Punctuation | 19371 | 1.7% |
| Close Punctuation | 6182 | 0.6% |
| Open Punctuation | 6182 | 0.6% |
| Other Punctuation | 3486 | 0.3% |
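The category names in this table are Unicode general categories. As a rough illustration of how such a breakdown can be produced, the sketch below counts characters per category with Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here, and the two sample strings are neighborhood values taken from the tables above.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes seen in this column.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["UPPER EAST SIDE (59-79)", "FLUSHING-NORTH"]
for name, count in category_counts(sample).most_common():
    print(name, count)
```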
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 104344 | 10.6% |
| A | 80142 | 8.1% |
| S | 79590 | 8.1% |
| R | 72558 | 7.4% |
| I | 64910 | 6.6% |
| O | 62467 | 6.4% |
| T | 62184 | 6.3% |
| L | 61705 | 6.3% |
| N | 60529 | 6.2% |
| H | 51508 | 5.2% |
| Other values (16) | 283733 |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 11951 | |
| 7 | 5769 | |
| 6 | 3365 | 13.4% |
| 5 | 3175 | 12.6% |
| 1 | 826 | 3.3% |
| 0 | 55 | 0.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 2230 | |
| . | 1256 |
Space Separator
| Value | Count | Frequency (%) |
| " " | 67350 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 19371 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 6182 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 6182 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 983670 | |
| Common | 127712 | 11.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 104344 | 10.6% |
| A | 80142 | 8.1% |
| S | 79590 | 8.1% |
| R | 72558 | 7.4% |
| I | 64910 | 6.6% |
| O | 62467 | 6.4% |
| T | 62184 | 6.3% |
| L | 61705 | 6.3% |
| N | 60529 | 6.2% |
| H | 51508 | 5.2% |
| Other values (16) | 283733 |
Common
| Value | Count | Frequency (%) |
| " " | 67350 | |
| - | 19371 | 15.2% |
| 9 | 11951 | 9.4% |
| ) | 6182 | 4.8% |
| ( | 6182 | 4.8% |
| 7 | 5769 | 4.5% |
| 6 | 3365 | 2.6% |
| 5 | 3175 | 2.5% |
| / | 2230 | 1.7% |
| . | 1256 | 1.0% |
| Other values (2) | 881 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1111382 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| E | 104344 | 9.4% |
| A | 80142 | 7.2% |
| S | 79590 | 7.2% |
| R | 72558 | 6.5% |
| " " | 67350 | 6.1% |
| I | 64910 | 5.8% |
| O | 62467 | 5.6% |
| T | 62184 | 5.6% |
| L | 61705 | 5.6% |
| N | 60529 | 5.4% |
| Other values (28) | 395603 |
BUILDING CLASS CATEGORY
Categorical
| Distinct | 47 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| 01 ONE FAMILY DWELLINGS | |
|---|---|
| 02 TWO FAMILY DWELLINGS | |
| 13 CONDOS - ELEVATOR APARTMENTS | |
| 10 COOPS - ELEVATOR APARTMENTS | |
| 03 THREE FAMILY DWELLINGS | |
| Other values (42) |
Length
| Max length | 44 |
|---|---|
| Median length | 43 |
| Mean length | 43.000509 |
| Min length | 43 |
Characters and Unicode
| Total characters | 3635607 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 07 RENTALS - WALKUP APARTMENTS |
|---|---|
| 2nd row | 07 RENTALS - WALKUP APARTMENTS |
| 3rd row | 07 RENTALS - WALKUP APARTMENTS |
| 4th row | 07 RENTALS - WALKUP APARTMENTS |
| 5th row | 07 RENTALS - WALKUP APARTMENTS |
Common Values
| Value | Count | Frequency (%) |
| 01 ONE FAMILY DWELLINGS | 18235 | |
| 02 TWO FAMILY DWELLINGS | 15828 | |
| 13 CONDOS - ELEVATOR APARTMENTS | 12989 | |
| 10 COOPS - ELEVATOR APARTMENTS | 12902 | |
| 03 THREE FAMILY DWELLINGS | 4384 | 5.2% |
| 07 RENTALS - WALKUP APARTMENTS | 3466 | 4.1% |
| 09 COOPS - WALKUP APARTMENTS | 2767 | 3.3% |
| 04 TAX CLASS 1 CONDOS | 1656 | 2.0% |
| 44 CONDO PARKING | 1441 | 1.7% |
| 15 CONDOS - 2-10 UNIT RESIDENTIAL | 1281 | 1.5% |
| Other values (37) | 9599 |
Words
| Value | Count | Frequency (%) |
| family | 38447 | 10.3% |
| dwellings | 38447 | 10.3% |
| 35824 | 9.6% | |
| apartments | 33432 | 8.9% |
| elevator | 26273 | 7.0% |
| 01 | 18235 | 4.9% |
| one | 18235 | 4.9% |
| condos | 16978 | 4.5% |
| coops | 16870 | 4.5% |
| two | 15828 | 4.2% |
| Other values (102) | 115289 |
Most occurring characters
| Value | Count | Frequency (%) |
| " " | 1746496 | |
| E | 165470 | 4.6% |
| L | 163874 | 4.5% |
| A | 162055 | 4.5% |
| O | 141686 | 3.9% |
| T | 129584 | 3.6% |
| N | 127409 | 3.5% |
| S | 125167 | 3.4% |
| I | 90952 | 2.5% |
| R | 76041 | 2.1% |
| Other values (26) | 706873 |
Most occurring categories
| Value | Count | Frequency (%) |
| Space Separator | 1746496 | |
| Uppercase Letter | 1672138 | |
| Decimal Number | 178488 | 4.9% |
| Dash Punctuation | 38292 | 1.1% |
| Other Punctuation | 193 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 165470 | 9.9% |
| L | 163874 | 9.8% |
| A | 162055 | 9.7% |
| O | 141686 | 8.5% |
| T | 129584 | 7.7% |
| N | 127409 | 7.6% |
| S | 125167 | 7.5% |
| I | 90952 | 5.4% |
| R | 76041 | 4.5% |
| M | 74296 | 4.4% |
| Other values (13) | 415604 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 63426 | |
| 1 | 54500 | |
| 2 | 21413 | 12.0% |
| 3 | 19069 | 10.7% |
| 4 | 7517 | 4.2% |
| 7 | 5345 | 3.0% |
| 9 | 3386 | 1.9% |
| 5 | 2784 | 1.6% |
| 6 | 560 | 0.3% |
| 8 | 488 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| " " | 1746496 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 38292 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 193 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1963469 | |
| Latin | 1672138 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 165470 | 9.9% |
| L | 163874 | 9.8% |
| A | 162055 | 9.7% |
| O | 141686 | 8.5% |
| T | 129584 | 7.7% |
| N | 127409 | 7.6% |
| S | 125167 | 7.5% |
| I | 90952 | 5.4% |
| R | 76041 | 4.5% |
| M | 74296 | 4.4% |
| Other values (13) | 415604 |
Common
| Value | Count | Frequency (%) |
| " " | 1746496 | |
| 0 | 63426 | 3.2% |
| 1 | 54500 | 2.8% |
| - | 38292 | 2.0% |
| 2 | 21413 | 1.1% |
| 3 | 19069 | 1.0% |
| 4 | 7517 | 0.4% |
| 7 | 5345 | 0.3% |
| 9 | 3386 | 0.2% |
| 5 | 2784 | 0.1% |
| Other values (3) | 1241 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3635607 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| " " | 1746496 | |
| E | 165470 | 4.6% |
| L | 163874 | 4.5% |
| A | 162055 | 4.5% |
| O | 141686 | 3.9% |
| T | 129584 | 3.6% |
| N | 127409 | 3.5% |
| S | 125167 | 3.4% |
| I | 90952 | 2.5% |
| R | 76041 | 2.1% |
| Other values (26) | 706873 |
TAX CLASS AT PRESENT
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| 1 | |
|---|---|
| 2 | |
| 4 | |
| 2A | 2521 |
| 2C | 1915 |
| Other values (6) |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.0959692 |
| Min length | 1 |
Characters and Unicode
| Total characters | 92662 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2A |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 2B |
| 5th row | 2A |
Common Values
| Value | Count | Frequency (%) |
| 1 | 38633 | |
| 2 | 30919 | |
| 4 | 6140 | 7.3% |
| 2A | 2521 | 3.0% |
| 2C | 1915 | 2.3% |
| 1A | 1444 | 1.7% |
| 1B | 1234 | 1.5% |
| 2B | 814 | 1.0% |
| " " | 738 | 0.9% |
| 1C | 186 | 0.2% |
Words
| Value | Count | Frequency (%) |
| 1 | 38633 | |
| 2 | 30919 | |
| 4 | 6140 | 7.3% |
| 2a | 2521 | 3.0% |
| 2c | 1915 | 2.3% |
| 1a | 1444 | 1.7% |
| 1b | 1234 | 1.5% |
| 2b | 814 | 1.0% |
| 1c | 186 | 0.2% |
| 3 | 4 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 41497 | |
| 2 | 36169 | |
| 4 | 6140 | 6.6% |
| A | 3965 | 4.3% |
| C | 2101 | 2.3% |
| B | 2048 | 2.2% |
| " " | 738 | 0.8% |
| 3 | 4 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 83810 | |
| Uppercase Letter | 8114 | 8.8% |
| Space Separator | 738 | 0.8% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 41497 | |
| 2 | 36169 | |
| 4 | 6140 | 7.3% |
| 3 | 4 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 3965 | |
| C | 2101 | |
| B | 2048 |
Space Separator
| Value | Count | Frequency (%) |
| " " | 738 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 84548 | |
| Latin | 8114 | 8.8% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 41497 | |
| 2 | 36169 | |
| 4 | 6140 | 7.3% |
| " " | 738 | 0.9% |
| 3 | 4 | < 0.1% |
Latin
| Value | Count | Frequency (%) |
| A | 3965 | |
| C | 2101 | |
| B | 2048 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 92662 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 41497 | |
| 2 | 36169 | |
| 4 | 6140 | 6.6% |
| A | 3965 | 4.3% |
| C | 2101 | 2.3% |
| B | 2048 | 2.2% |
| " " | 738 | 0.8% |
| 3 | 4 | < 0.1% |
BLOCK
Real number (ℝ)
| Distinct | 11566 |
|---|---|
| Distinct (%) | 13.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4237.219 |
| Minimum | 1 |
| Maximum | 16322 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 276 |
| Q1 | 1322.75 |
| median | 3311 |
| Q3 | 6281 |
| 95-th percentile | 11615.65 |
| Maximum | 16322 |
| Range | 16321 |
| Interquartile range (IQR) | 4958.25 |
Descriptive statistics
| Standard deviation | 3568.2634 |
|---|---|
| Coefficient of variation (CV) | 0.84212391 |
| Kurtosis | 0.59689403 |
| Mean | 4237.219 |
| Median Absolute Deviation (MAD) | 2212 |
| Skewness | 1.049335 |
| Sum | 3.5824839 × 10⁸ |
| Variance | 12732504 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5066 | 404 | 0.5% |
| 16 | 255 | 0.3% |
| 2135 | 211 | 0.2% |
| 4978 | 187 | 0.2% |
| 1171 | 181 | 0.2% |
| 8489 | 170 | 0.2% |
| 1226 | 168 | 0.2% |
| 3944 | 152 | 0.2% |
| 31 | 135 | 0.2% |
| 1129 | 135 | 0.2% |
| Other values (11556) | 82550 |
| Value | Count | Frequency (%) |
| 1 | 26 | < 0.1% |
| 3 | 5 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 2 | < 0.1% |
| 7 | 2 | < 0.1% |
| 8 | 3 | < 0.1% |
| 10 | 10 | < 0.1% |
| 13 | 2 | < 0.1% |
| 15 | 25 | < 0.1% |
| 16 | 255 |
| Value | Count | Frequency (%) |
| 16322 | 1 | < 0.1% |
| 16319 | 1 | < 0.1% |
| 16317 | 3 | |
| 16316 | 2 | |
| 16315 | 2 | |
| 16313 | 2 | |
| 16310 | 2 | |
| 16305 | 4 | |
| 16304 | 3 | |
| 16300 | 2 |
LOT
Real number (ℝ)
| Distinct | 2627 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 376.22401 |
| Minimum | 1 |
| Maximum | 9106 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 22 |
| median | 50 |
| Q3 | 1001 |
| 95-th percentile | 1403 |
| Maximum | 9106 |
| Range | 9105 |
| Interquartile range (IQR) | 979 |
Descriptive statistics
| Standard deviation | 658.13681 |
|---|---|
| Coefficient of variation (CV) | 1.7493216 |
| Kurtosis | 24.937658 |
| Mean | 376.22401 |
| Median Absolute Deviation (MAD) | 38 |
| Skewness | 3.5006793 |
| Sum | 31808988 |
| Variance | 433144.07 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 4125 | 4.9% |
| 20 | 983 | 1.2% |
| 12 | 972 | 1.1% |
| 40 | 935 | 1.1% |
| 23 | 911 | 1.1% |
| 10 | 895 | 1.1% |
| 15 | 894 | 1.1% |
| 29 | 891 | 1.1% |
| 25 | 879 | 1.0% |
| 19 | 874 | 1.0% |
| Other values (2617) | 72189 |
| Value | Count | Frequency (%) |
| 1 | 4125 | |
| 2 | 742 | 0.9% |
| 3 | 811 | 1.0% |
| 4 | 685 | 0.8% |
| 5 | 805 | 1.0% |
| 6 | 837 | 1.0% |
| 7 | 830 | 1.0% |
| 8 | 787 | 0.9% |
| 9 | 783 | 0.9% |
| 10 | 895 | 1.1% |
| Value | Count | Frequency (%) |
| 9106 | 1 | |
| 9099 | 1 | |
| 9085 | 1 | |
| 9081 | 1 | |
| 9080 | 1 | |
| 9056 | 1 | |
| 9053 | 2 | |
| 9050 | 1 | |
| 9049 | 1 | |
| 9040 | 1 |
EASE-MENT
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 84548 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | |
|---|---|
| 2nd row | |
| 3rd row | |
| 4th row | |
| 5th row |
Common Values
| Value | Count | Frequency (%) |
| " " | 84548 |
Words
| Value | Count | Frequency (%) |
| No values found. | ||
Most occurring characters
| Value | Count | Frequency (%) |
| " " | 84548 |
Most occurring categories
| Value | Count | Frequency (%) |
| Space Separator | 84548 |
Most frequent character per category
Space Separator
| Value | Count | Frequency (%) |
| " " | 84548 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 84548 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| " " | 84548 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 84548 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| " " | 84548 |
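Every EASE-MENT value is a single space, yet the report counts zero missing cells, because a blank string is still a value. If the blanks are meant to represent missing data (an assumption, not something the report states), they can be converted before profiling. The sketch below shows the idea in plain Python. The helper name and the toy column are hypothetical.

```python
def blank_to_none(value):
    """Return None for empty or whitespace-only strings, else the value."""
    if isinstance(value, str) and value.strip() == "":
        return None
    return value


# Toy column mixing blanks with real apartment-style labels.
column = [" ", "3A", " ", "2", ""]
cleaned = [blank_to_none(v) for v in column]
missing = sum(v is None for v in cleaned)

print(cleaned)
print(f"{missing} of {len(cleaned)} values treated as missing")
```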
BUILDING CLASS AT PRESENT
Categorical
| Distinct | 167 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| D4 | |
|---|---|
| R4 | |
| A1 | |
| A5 | |
| B2 | |
| Other values (162) |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 1.9912712 |
| Min length | 1 |
Characters and Unicode
| Total characters | 168358 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 13 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | C2 |
|---|---|
| 2nd row | C7 |
| 3rd row | C7 |
| 4th row | C4 |
| 5th row | C2 |
Common Values
| Value | Count | Frequency (%) |
| D4 | 12663 | |
| R4 | 12482 | |
| A1 | 6753 | 8.0% |
| A5 | 5683 | 6.7% |
| B2 | 4923 | 5.8% |
| B1 | 4749 | 5.6% |
| C0 | 4379 | 5.2% |
| B3 | 3824 | 4.5% |
| A2 | 2821 | 3.3% |
| C6 | 2760 | 3.3% |
| Other values (157) | 23511 |
Words
| Value | Count | Frequency (%) |
| d4 | 12663 | |
| r4 | 12482 | |
| a1 | 6753 | 8.1% |
| a5 | 5683 | 6.8% |
| b2 | 4923 | 5.9% |
| b1 | 4749 | 5.7% |
| c0 | 4379 | 5.2% |
| b3 | 3824 | 4.6% |
| a2 | 2821 | 3.4% |
| c6 | 2760 | 3.3% |
| Other values (156) | 22773 |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 26151 | |
| R | 20291 | |
| A | 17872 | |
| B | 15514 | |
| 1 | 15395 | |
| D | 13289 | |
| 2 | 10741 | |
| C | 10610 | |
| 3 | 7128 | 4.2% |
| 0 | 6512 | 3.9% |
| Other values (26) | 24855 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 86498 | |
| Decimal Number | 81122 | |
| Space Separator | 738 | 0.4% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 20291 | |
| A | 17872 | |
| B | 15514 | |
| D | 13289 | |
| C | 10610 | |
| S | 2194 | 2.5% |
| G | 1805 | 2.1% |
| V | 1700 | 2.0% |
| K | 1092 | 1.3% |
| O | 348 | 0.4% |
| Other values (15) | 1783 | 2.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 26151 | |
| 1 | 15395 | |
| 2 | 10741 | |
| 3 | 7128 | 8.8% |
| 0 | 6512 | 8.0% |
| 5 | 6242 | 7.7% |
| 9 | 4812 | 5.9% |
| 6 | 3182 | 3.9% |
| 7 | 769 | 0.9% |
| 8 | 190 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| " " | 738 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 86498 | |
| Common | 81860 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| R | 20291 | |
| A | 17872 | |
| B | 15514 | |
| D | 13289 | |
| C | 10610 | |
| S | 2194 | 2.5% |
| G | 1805 | 2.1% |
| V | 1700 | 2.0% |
| K | 1092 | 1.3% |
| O | 348 | 0.4% |
| Other values (15) | 1783 | 2.1% |
Common
| Value | Count | Frequency (%) |
| 4 | 26151 | |
| 1 | 15395 | |
| 2 | 10741 | |
| 3 | 7128 | 8.7% |
| 0 | 6512 | 8.0% |
| 5 | 6242 | 7.6% |
| 9 | 4812 | 5.9% |
| 6 | 3182 | 3.9% |
| 7 | 769 | 0.9% |
| " " | 738 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 168358 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4 | 26151 | |
| R | 20291 | |
| A | 17872 | |
| B | 15514 | |
| 1 | 15395 | |
| D | 13289 | |
| 2 | 10741 | |
| C | 10610 | |
| 3 | 7128 | 4.2% |
| 0 | 6512 | 3.9% |
| Other values (26) | 24855 |
ADDRESS
Categorical
| Distinct | 67563 |
|---|---|
| Distinct (%) | 79.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| 131-05 40TH ROAD | 210 |
|---|---|
| 429 KENT AVENUE | 158 |
| 169 WEST 95TH STREET | 153 |
| 131-03 40TH ROAD | 147 |
| 265 STATE STREET | 127 |
| Other values (67558) |
Length
| Max length | 34 |
|---|---|
| Median length | 30 |
| Mean length | 19.262644 |
| Min length | 5 |
Characters and Unicode
| Total characters | 1628618 |
|---|---|
| Distinct characters | 45 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 62078 |
|---|---|
| Unique (%) | 73.4% |
Sample
| 1st row | 153 AVENUE B |
|---|---|
| 2nd row | 234 EAST 4TH STREET |
| 3rd row | 197 EAST 3RD STREET |
| 4th row | 154 EAST 7TH STREET |
| 5th row | 301 EAST 10TH STREET |
Common Values
| Value | Count | Frequency (%) |
| 131-05 40TH ROAD | 210 | 0.2% |
| 429 KENT AVENUE | 158 | 0.2% |
| 169 WEST 95TH STREET | 153 | 0.2% |
| 131-03 40TH ROAD | 147 | 0.2% |
| 265 STATE STREET | 127 | 0.2% |
| 550 VANDERBILT AVENUE | 126 | 0.1% |
| 50 WEST STREET | 115 | 0.1% |
| 39TH AVENUE | 108 | 0.1% |
| 30 PARK PLACE | 107 | 0.1% |
| 1809 EMMONS AVENUE | 103 | 0.1% |
| Other values (67553) | 83194 |
Words
| Value | Count | Frequency (%) |
| street | 39956 | 13.7% |
| avenue | 24787 | 8.5% |
| east | 9802 | 3.4% |
| west | 6651 | 2.3% |
| road | 3501 | 1.2% |
| place | 3055 | 1.1% |
| ave | 1656 | 0.6% |
| park | 1435 | 0.5% |
| boulevard | 1411 | 0.5% |
| st | 1410 | 0.5% |
| Other values (21800) | 197250 |
Most occurring characters
| Value | Count | Frequency (%) |
| " " | 236125 | |
| E | 191590 | 11.8% |
| T | 146406 | 9.0% |
| R | 81143 | 5.0% |
| 1 | 79718 | 4.9% |
| A | 79687 | 4.9% |
| S | 77236 | 4.7% |
| N | 58986 | 3.6% |
| 2 | 52644 | 3.2% |
| 3 | 42496 | 2.6% |
| Other values (35) | 582587 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 948427 | |
| Decimal Number | 400922 | |
| Space Separator | 236125 | 14.5% |
| Dash Punctuation | 25141 | 1.5% |
| Other Punctuation | 18003 | 1.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 191590 | |
| T | 146406 | |
| R | 81143 | |
| A | 79687 | |
| S | 77236 | |
| N | 58986 | 6.2% |
| H | 37130 | 3.9% |
| U | 34792 | 3.7% |
| V | 34168 | 3.6% |
| O | 31485 | 3.3% |
| Other values (16) | 175804 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 79718 | |
| 2 | 52644 | |
| 3 | 42496 | |
| 5 | 40828 | |
| 0 | 39582 | |
| 4 | 37706 | |
| 6 | 30373 | 7.6% |
| 7 | 27885 | 7.0% |
| 8 | 26154 | 6.5% |
| 9 | 23536 | 5.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 16730 | |
| / | 743 | 4.1% |
| . | 457 | 2.5% |
| # | 37 | 0.2% |
| ' | 23 | 0.1% |
| & | 7 | < 0.1% |
| * | 6 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| " " | 236125 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 25141 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 948427 | |
| Common | 680191 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 191590 | |
| T | 146406 | |
| R | 81143 | |
| A | 79687 | |
| S | 77236 | |
| N | 58986 | 6.2% |
| H | 37130 | 3.9% |
| U | 34792 | 3.7% |
| V | 34168 | 3.6% |
| O | 31485 | 3.3% |
| Other values (16) | 175804 |
Common
| Value | Count | Frequency (%) |
| " " | 236125 | |
| 1 | 79718 | 11.7% |
| 2 | 52644 | 7.7% |
| 3 | 42496 | 6.2% |
| 5 | 40828 | 6.0% |
| 0 | 39582 | 5.8% |
| 4 | 37706 | 5.5% |
| 6 | 30373 | 4.5% |
| 7 | 27885 | 4.1% |
| 8 | 26154 | 3.8% |
| Other values (9) | 66680 | 9.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1628618 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| " " | 236125 | |
| E | 191590 | 11.8% |
| T | 146406 | 9.0% |
| R | 81143 | 5.0% |
| 1 | 79718 | 4.9% |
| A | 79687 | 4.9% |
| S | 77236 | 4.7% |
| N | 58986 | 3.6% |
| 2 | 52644 | 3.2% |
| 3 | 42496 | 2.6% |
| Other values (35) | 582587 |
APARTMENT NUMBER
Categorical
| Distinct | 3989 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| 4 | 298 |
|---|---|
| 3A | 295 |
| 2 | 275 |
| 3B | 275 |
| Other values (3984) |
Length
| Max length | 11 |
|---|---|
| Median length | 1 |
| Mean length | 1.3446563 |
| Min length | 1 |
Characters and Unicode
| Total characters | 113688 |
|---|---|
| Distinct characters | 48 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2458 |
|---|---|
| Unique (%) | 2.9% |
Sample
| 1st row | |
|---|---|
| 2nd row | |
| 3rd row | |
| 4th row | |
| 5th row |
Common Values
| Value | Count | Frequency (%) |
| " " | 65496 | |
| 4 | 298 | 0.4% |
| 3A | 295 | 0.3% |
| 2 | 275 | 0.3% |
| 3B | 275 | 0.3% |
| 2B | 272 | 0.3% |
| 3 | 263 | 0.3% |
| 2A | 263 | 0.3% |
| 1 | 242 | 0.3% |
| 4B | 228 | 0.3% |
| Other values (3979) | 16641 | 19.7% |
Words
| Value | Count | Frequency (%) |
| 4 | 309 | 1.6% |
| 3a | 295 | 1.5% |
| 2 | 285 | 1.5% |
| 3b | 275 | 1.4% |
| 2b | 274 | 1.4% |
| 3 | 270 | 1.4% |
| 2a | 264 | 1.4% |
| 1 | 248 | 1.3% |
| 4b | 228 | 1.2% |
| 4a | 206 | 1.1% |
| Other values (3810) | 16605 |
Most occurring characters
| Value | Count | Frequency (%) |
| " " | 65703 | |
| 1 | 6322 | 5.6% |
| 2 | 4521 | 4.0% |
| 3 | 3568 | 3.1% |
| 4 | 3105 | 2.7% |
| A | 2640 | 2.3% |
| 5 | 2547 | 2.2% |
| B | 2430 | 2.1% |
| 0 | 2389 | 2.1% |
| 6 | 2131 | 1.9% |
| Other values (38) | 18332 | 16.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Space Separator | 65703 | |
| Decimal Number | 28928 | |
| Uppercase Letter | 18172 | 16.0% |
| Dash Punctuation | 808 | 0.7% |
| Other Punctuation | 65 | 0.1% |
| Math Symbol | 4 | < 0.1% |
| Modifier Symbol | 2 | < 0.1% |
| Open Punctuation | 2 | < 0.1% |
| Close Punctuation | 2 | < 0.1% |
| Lowercase Letter | 2 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 2640 | |
| B | 2430 | |
| C | 1947 | |
| P | 1732 | |
| D | 1379 | |
| E | 1144 | 6.3% |
| H | 1025 | 5.6% |
| F | 872 | 4.8% |
| S | 839 | 4.6% |
| G | 758 | 4.2% |
| Other values (16) | 3406 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 6322 | |
| 2 | 4521 | |
| 3 | 3568 | |
| 4 | 3105 | |
| 5 | 2547 | |
| 0 | 2389 | 8.3% |
| 6 | 2131 | 7.4% |
| 7 | 1635 | 5.7% |
| 8 | 1478 | 5.1% |
| 9 | 1232 | 4.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 43 | |
| . | 13 | 20.0% |
| & | 7 | 10.8% |
| # | 2 | 3.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| b | 1 | |
| c | 1 |
Space Separator
| Value | Count | Frequency (%) |
| " " | 65703 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 808 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 4 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 2 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 95514 | |
| Latin | 18174 | 16.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 2640 | |
| B | 2430 | |
| C | 1947 | |
| P | 1732 | |
| D | 1379 | |
| E | 1144 | 6.3% |
| H | 1025 | 5.6% |
| F | 872 | 4.8% |
| S | 839 | 4.6% |
| G | 758 | 4.2% |
| Other values (18) | 3408 |
Common
| Value | Count | Frequency (%) |
| " " | 65703 | |
| 1 | 6322 | 6.6% |
| 2 | 4521 | 4.7% |
| 3 | 3568 | 3.7% |
| 4 | 3105 | 3.3% |
| 5 | 2547 | 2.7% |
| 0 | 2389 | 2.5% |
| 6 | 2131 | 2.2% |
| 7 | 1635 | 1.7% |
| 8 | 1478 | 1.5% |
| Other values (10) | 2115 | 2.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 113688 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 65703 | 57.8% |
| 1 | 6322 | 5.6% |
| 2 | 4521 | 4.0% |
| 3 | 3568 | 3.1% |
| 4 | 3105 | 2.7% |
| A | 2640 | 2.3% |
| 5 | 2547 | 2.2% |
| B | 2430 | 2.1% |
| 0 | 2389 | 2.1% |
| 6 | 2131 | 1.9% |
| Other values (38) | 18332 | 16.1% |
ZIP CODE
Numeric
| Distinct | 186 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10731.992 |
| Minimum | 0 |
|---|---|
| Maximum | 11694 |
| Zeros | 982 |
| Zeros (%) | 1.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10011 |
| Q1 | 10305 |
| median | 11209 |
| Q3 | 11357 |
| 95-th percentile | 11427 |
| Maximum | 11694 |
| Range | 11694 |
| Interquartile range (IQR) | 1052 |
Descriptive statistics
| Standard deviation | 1290.8791 |
|---|---|
| Coefficient of variation (CV) | 0.12028328 |
| Kurtosis | 52.539297 |
| Mean | 10731.992 |
| Median Absolute Deviation (MAD) | 206 |
| Skewness | -6.6563208 |
| Sum | 9.0736843 × 10⁸ |
| Variance | 1666369 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10314 | 1687 | 2.0% |
| 11354 | 1384 | 1.6% |
| 11201 | 1324 | 1.6% |
| 11235 | 1312 | 1.6% |
| 11234 | 1165 | 1.4% |
| 11375 | 1144 | 1.4% |
| 10312 | 1088 | 1.3% |
| 10306 | 1061 | 1.3% |
| 10023 | 1053 | 1.2% |
| 10011 | 1048 | 1.2% |
| Other values (176) | 72282 | 85.5% |
| Value | Count | Frequency (%) |
| 0 | 982 | 1.2% |
| 10001 | 204 | 0.2% |
| 10002 | 328 | 0.4% |
| 10003 | 812 | 1.0% |
| 10004 | 95 | 0.1% |
| 10005 | 199 | 0.2% |
| 10006 | 184 | 0.2% |
| 10007 | 313 | 0.4% |
| 10009 | 244 | 0.3% |
| 10010 | 459 | 0.5% |
| Value | Count | Frequency (%) |
| 11694 | 273 | 0.3% |
| 11693 | 142 | 0.2% |
| 11692 | 157 | 0.2% |
| 11691 | 435 | 0.5% |
| 11436 | 312 | 0.4% |
| 11435 | 525 | 0.6% |
| 11434 | 705 | 0.8% |
| 11433 | 442 | 0.5% |
| 11432 | 533 | 0.6% |
| 11430 | 1 | < 0.1% |
RESIDENTIAL UNITS
Numeric
| Distinct | 176 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.0252638 |
| Minimum | 0 |
|---|---|
| Maximum | 1844 |
| Zeros | 24783 |
| Zeros (%) | 29.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 1844 |
| Range | 1844 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 16.721037 |
|---|---|
| Coefficient of variation (CV) | 8.2562269 |
| Kurtosis | 5299.9341 |
| Mean | 2.0252638 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 60.702733 |
| Sum | 171232 |
| Variance | 279.59308 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 34722 | 41.1% |
| 0 | 24783 | 29.3% |
| 2 | 16049 | 19.0% |
| 3 | 4608 | 5.5% |
| 4 | 1346 | 1.6% |
| 6 | 787 | 0.9% |
| 8 | 332 | 0.4% |
| 5 | 273 | 0.3% |
| 10 | 145 | 0.2% |
| 16 | 122 | 0.1% |
| Other values (166) | 1381 | 1.6% |
| Value | Count | Frequency (%) |
| 0 | 24783 | 29.3% |
| 1 | 34722 | 41.1% |
| 2 | 16049 | 19.0% |
| 3 | 4608 | 5.5% |
| 4 | 1346 | 1.6% |
| 5 | 273 | 0.3% |
| 6 | 787 | 0.9% |
| 7 | 121 | 0.1% |
| 8 | 332 | 0.4% |
| 9 | 113 | 0.1% |
| Value | Count | Frequency (%) |
| 1844 | 2 | < 0.1% |
| 1641 | 1 | < 0.1% |
| 948 | 1 | < 0.1% |
| 894 | 1 | < 0.1% |
| 889 | 1 | < 0.1% |
| 771 | 3 | < 0.1% |
| 716 | 2 | < 0.1% |
| 680 | 2 | < 0.1% |
| 550 | 1 | < 0.1% |
| 529 | 1 | < 0.1% |
COMMERCIAL UNITS
Numeric
| Distinct | 55 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.19355869 |
| Minimum | 0 |
|---|---|
| Maximum | 2261 |
| Zeros | 79429 |
| Zeros (%) | 93.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 2261 |
| Range | 2261 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 8.7131834 |
|---|---|
| Coefficient of variation (CV) | 45.015718 |
| Kurtosis | 53950.593 |
| Mean | 0.19355869 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 214.40112 |
| Sum | 16365 |
| Variance | 75.919564 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 79429 | 93.9% |
| 1 | 3558 | 4.2% |
| 2 | 817 | 1.0% |
| 3 | 259 | 0.3% |
| 4 | 137 | 0.2% |
| 5 | 74 | 0.1% |
| 6 | 70 | 0.1% |
| 7 | 31 | < 0.1% |
| 8 | 26 | < 0.1% |
| 9 | 20 | < 0.1% |
| Other values (45) | 127 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 79429 | 93.9% |
| 1 | 3558 | 4.2% |
| 2 | 817 | 1.0% |
| 3 | 259 | 0.3% |
| 4 | 137 | 0.2% |
| 5 | 74 | 0.1% |
| 6 | 70 | 0.1% |
| 7 | 31 | < 0.1% |
| 8 | 26 | < 0.1% |
| 9 | 20 | < 0.1% |
| Value | Count | Frequency (%) |
| 2261 | 1 | < 0.1% |
| 436 | 2 | < 0.1% |
| 422 | 2 | < 0.1% |
| 318 | 1 | < 0.1% |
| 254 | 4 | < 0.1% |
| 184 | 1 | < 0.1% |
| 172 | 1 | < 0.1% |
| 147 | 1 | < 0.1% |
| 126 | 2 | < 0.1% |
| 91 | 1 | < 0.1% |
TOTAL UNITS
Numeric
| Distinct | 192 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.2491839 |
| Minimum | 0 |
|---|---|
| Maximum | 2261 |
| Zeros | 19762 |
| Zeros (%) | 23.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 2261 |
| Range | 2261 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 18.972584 |
|---|---|
| Coefficient of variation (CV) | 8.4353193 |
| Kurtosis | 5719.5837 |
| Mean | 2.2491839 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 63.448337 |
| Sum | 190164 |
| Variance | 359.95896 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 38356 | 45.4% |
| 0 | 19762 | 23.4% |
| 2 | 15914 | 18.8% |
| 3 | 5412 | 6.4% |
| 4 | 1498 | 1.8% |
| 6 | 870 | 1.0% |
| 5 | 423 | 0.5% |
| 8 | 374 | 0.4% |
| 10 | 198 | 0.2% |
| 7 | 197 | 0.2% |
| Other values (182) | 1544 | 1.8% |
| Value | Count | Frequency (%) |
| 0 | 19762 | 23.4% |
| 1 | 38356 | 45.4% |
| 2 | 15914 | 18.8% |
| 3 | 5412 | 6.4% |
| 4 | 1498 | 1.8% |
| 5 | 423 | 0.5% |
| 6 | 870 | 1.0% |
| 7 | 197 | 0.2% |
| 8 | 374 | 0.4% |
| 9 | 142 | 0.2% |
| Value | Count | Frequency (%) |
| 2261 | 1 | < 0.1% |
| 1866 | 2 | < 0.1% |
| 1653 | 1 | < 0.1% |
| 955 | 1 | < 0.1% |
| 902 | 1 | < 0.1% |
| 889 | 1 | < 0.1% |
| 771 | 3 | < 0.1% |
| 736 | 2 | < 0.1% |
| 680 | 2 | < 0.1% |
| 551 | 1 | < 0.1% |
LAND SQUARE FEET
Categorical
| Distinct | 6062 |
|---|---|
| Distinct (%) | 7.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| - | 26252 |
|---|---|
| 0 | 10326 |
| 2000 | 3919 |
| 2500 | 3470 |
| 4000 | 3044 |
| Other values (6057) | 37537 |
Length
| Max length | 7 |
|---|---|
| Median length | 4 |
| Mean length | 3.6486729 |
| Min length | 1 |
Characters and Unicode
| Total characters | 308488 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2675 |
|---|---|
| Unique (%) | 3.2% |
Sample
| 1st row | 1633 |
|---|---|
| 2nd row | 4616 |
| 3rd row | 2212 |
| 4th row | 2272 |
| 5th row | 2369 |
Common Values
| Value | Count | Frequency (%) |
| - | 26252 | 31.0% |
| 0 | 10326 | 12.2% |
| 2000 | 3919 | 4.6% |
| 2500 | 3470 | 4.1% |
| 4000 | 3044 | 3.6% |
| 1800 | 1192 | 1.4% |
| 3000 | 1190 | 1.4% |
| 5000 | 1009 | 1.2% |
| 2200 | 512 | 0.6% |
| 2400 | 486 | 0.6% |
| Other values (6052) | 33148 | 39.2% |
Words
| Value | Count | Frequency (%) |
| - | 26252 | 31.0% |
| 0 | 10326 | 12.2% |
| 2000 | 3919 | 4.6% |
| 2500 | 3470 | 4.1% |
| 4000 | 3044 | 3.6% |
| 1800 | 1192 | 1.4% |
| 3000 | 1190 | 1.4% |
| 5000 | 1009 | 1.2% |
| 2200 | 512 | 0.6% |
| 2400 | 486 | 0.6% |
| Other values (6052) | 33148 | 39.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 78756 | 25.5% |
| 0 | 78009 | 25.3% |
| 2 | 28715 | 9.3% |
| - | 26252 | 8.5% |
| 5 | 17835 | 5.8% |
| 1 | 17008 | 5.5% |
| 3 | 13832 | 4.5% |
| 4 | 13603 | 4.4% |
| 8 | 9653 | 3.1% |
| 6 | 9217 | 3.0% |
| Other values (2) | 15608 | 5.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 203480 | 66.0% |
| Space Separator | 78756 | 25.5% |
| Dash Punctuation | 26252 | 8.5% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 78009 | 38.3% |
| 2 | 28715 | 14.1% |
| 5 | 17835 | 8.8% |
| 1 | 17008 | 8.4% |
| 3 | 13832 | 6.8% |
| 4 | 13603 | 6.7% |
| 8 | 9653 | 4.7% |
| 6 | 9217 | 4.5% |
| 7 | 8986 | 4.4% |
| 9 | 6622 | 3.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 78756 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 26252 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 308488 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 78756 | 25.5% |
| 0 | 78009 | 25.3% |
| 2 | 28715 | 9.3% |
| - | 26252 | 8.5% |
| 5 | 17835 | 5.8% |
| 1 | 17008 | 5.5% |
| 3 | 13832 | 4.5% |
| 4 | 13603 | 4.4% |
| 8 | 9653 | 3.1% |
| 6 | 9217 | 3.0% |
| Other values (2) | 15608 | 5.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 308488 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 78756 | 25.5% |
| 0 | 78009 | 25.3% |
| 2 | 28715 | 9.3% |
| - | 26252 | 8.5% |
| 5 | 17835 | 5.8% |
| 1 | 17008 | 5.5% |
| 3 | 13832 | 4.5% |
| 4 | 13603 | 4.4% |
| 8 | 9653 | 3.1% |
| 6 | 9217 | 3.0% |
| Other values (2) | 15608 | 5.1% |
GROSS SQUARE FEET
Categorical
| Distinct | 5691 |
|---|---|
| Distinct (%) | 6.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| - | 27612 |
|---|---|
| 0 | 11417 |
| 2400 | 386 |
| 1800 | 361 |
| 2000 | 359 |
| Other values (5686) | 44413 |
Length
| Max length | 7 |
|---|---|
| Median length | 4 |
| Mean length | 3.5957799 |
| Min length | 1 |
Characters and Unicode
| Total characters | 304016 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2365 |
|---|---|
| Unique (%) | 2.8% |
Sample
| 1st row | 6440 |
|---|---|
| 2nd row | 18690 |
| 3rd row | 7803 |
| 4th row | 6794 |
| 5th row | 4615 |
Common Values
| Value | Count | Frequency (%) |
| - | 27612 | 32.7% |
| 0 | 11417 | 13.5% |
| 2400 | 386 | 0.5% |
| 1800 | 361 | 0.4% |
| 2000 | 359 | 0.4% |
| 1600 | 346 | 0.4% |
| 1440 | 340 | 0.4% |
| 3000 | 324 | 0.4% |
| 1200 | 295 | 0.3% |
| 1280 | 281 | 0.3% |
| Other values (5681) | 42827 | 50.7% |
Words
| Value | Count | Frequency (%) |
| - | 27612 | 32.7% |
| 0 | 11417 | 13.5% |
| 2400 | 386 | 0.5% |
| 1800 | 361 | 0.4% |
| 2000 | 359 | 0.4% |
| 1600 | 346 | 0.4% |
| 1440 | 340 | 0.4% |
| 3000 | 324 | 0.4% |
| 1200 | 295 | 0.3% |
| 1280 | 281 | 0.3% |
| Other values (5681) | 42827 | 50.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 82836 | 27.2% |
| 0 | 45165 | 14.9% |
| 1 | 30980 | 10.2% |
| 2 | 28708 | 9.4% |
| - | 27612 | 9.1% |
| 4 | 15934 | 5.2% |
| 3 | 14746 | 4.9% |
| 6 | 14576 | 4.8% |
| 8 | 14498 | 4.8% |
| 5 | 12031 | 4.0% |
| Other values (2) | 16930 | 5.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 193568 | 63.7% |
| Space Separator | 82836 | 27.2% |
| Dash Punctuation | 27612 | 9.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 45165 | 23.3% |
| 1 | 30980 | 16.0% |
| 2 | 28708 | 14.8% |
| 4 | 15934 | 8.2% |
| 3 | 14746 | 7.6% |
| 6 | 14576 | 7.5% |
| 8 | 14498 | 7.5% |
| 5 | 12031 | 6.2% |
| 9 | 8593 | 4.4% |
| 7 | 8337 | 4.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 82836 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 27612 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 304016 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 82836 | 27.2% |
| 0 | 45165 | 14.9% |
| 1 | 30980 | 10.2% |
| 2 | 28708 | 9.4% |
| - | 27612 | 9.1% |
| 4 | 15934 | 5.2% |
| 3 | 14746 | 4.9% |
| 6 | 14576 | 4.8% |
| 8 | 14498 | 4.8% |
| 5 | 12031 | 4.0% |
| Other values (2) | 16930 | 5.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 304016 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 82836 | 27.2% |
| 0 | 45165 | 14.9% |
| 1 | 30980 | 10.2% |
| 2 | 28708 | 9.4% |
| - | 27612 | 9.1% |
| 4 | 15934 | 5.2% |
| 3 | 14746 | 4.9% |
| 6 | 14576 | 4.8% |
| 8 | 14498 | 4.8% |
| 5 | 12031 | 4.0% |
| Other values (2) | 16930 | 5.6% |
YEAR BUILT
Numeric
| Distinct | 158 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1789.323 |
| Minimum | 0 |
|---|---|
| Maximum | 2017 |
| Zeros | 6970 |
| Zeros (%) | 8.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 660.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1920 |
| median | 1940 |
| Q3 | 1965 |
| 95-th percentile | 2013 |
| Maximum | 2017 |
| Range | 2017 |
| Interquartile range (IQR) | 45 |
Descriptive statistics
| Standard deviation | 537.34499 |
|---|---|
| Coefficient of variation (CV) | 0.30030632 |
| Kurtosis | 7.1463801 |
| Mean | 1789.323 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | -3.016062 |
| Sum | 1.5128368 × 10⁸ |
| Variance | 288739.64 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 6970 | 8.2% |
| 1920 | 6045 | 7.1% |
| 1930 | 5043 | 6.0% |
| 1925 | 4312 | 5.1% |
| 1910 | 3585 | 4.2% |
| 1950 | 3156 | 3.7% |
| 1960 | 2654 | 3.1% |
| 1940 | 2456 | 2.9% |
| 1931 | 2246 | 2.7% |
| 1955 | 1961 | 2.3% |
| Other values (148) | 46120 | 54.5% |
| Value | Count | Frequency (%) |
| 0 | 6970 | 8.2% |
| 1111 | 1 | < 0.1% |
| 1680 | 1 | < 0.1% |
| 1800 | 37 | < 0.1% |
| 1826 | 1 | < 0.1% |
| 1829 | 1 | < 0.1% |
| 1832 | 1 | < 0.1% |
| 1835 | 2 | < 0.1% |
| 1840 | 2 | < 0.1% |
| 1844 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 2017 | 6 | < 0.1% |
| 2016 | 794 | 0.9% |
| 2015 | 1470 | 1.7% |
| 2014 | 1232 | 1.5% |
| 2013 | 743 | 0.9% |
| 2012 | 276 | 0.3% |
| 2011 | 154 | 0.2% |
| 2010 | 358 | 0.4% |
| 2009 | 579 | 0.7% |
| 2008 | 935 | 1.1% |
TAX CLASS AT TIME OF SALE
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| 1 | 41533 |
|---|---|
| 2 | 36726 |
| 4 | 6285 |
| 3 | 4 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 84548 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 2 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 41533 | 49.1% |
| 2 | 36726 | 43.4% |
| 4 | 6285 | 7.4% |
| 3 | 4 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| 1 | 41533 | 49.1% |
| 2 | 36726 | 43.4% |
| 4 | 6285 | 7.4% |
| 3 | 4 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 41533 | 49.1% |
| 2 | 36726 | 43.4% |
| 4 | 6285 | 7.4% |
| 3 | 4 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 84548 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 41533 | 49.1% |
| 2 | 36726 | 43.4% |
| 4 | 6285 | 7.4% |
| 3 | 4 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 84548 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 41533 | 49.1% |
| 2 | 36726 | 43.4% |
| 4 | 6285 | 7.4% |
| 3 | 4 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 84548 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 41533 | 49.1% |
| 2 | 36726 | 43.4% |
| 4 | 6285 | 7.4% |
| 3 | 4 | < 0.1% |
BUILDING CLASS AT TIME OF SALE
Categorical
| Distinct | 166 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| R4 | 12989 |
|---|---|
| D4 | 12666 |
| A1 | 6751 |
| A5 | 5671 |
| B2 | 4918 |
| Other values (161) | 41553 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 169096 |
|---|---|
| Distinct characters | 35 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 13 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | C2 |
|---|---|
| 2nd row | C7 |
| 3rd row | C7 |
| 4th row | C4 |
| 5th row | C2 |
Common Values
| Value | Count | Frequency (%) |
| R4 | 12989 | 15.4% |
| D4 | 12666 | 15.0% |
| A1 | 6751 | 8.0% |
| A5 | 5671 | 6.7% |
| B2 | 4918 | 5.8% |
| B1 | 4747 | 5.6% |
| C0 | 4384 | 5.2% |
| B3 | 3821 | 4.5% |
| A2 | 2867 | 3.4% |
| C6 | 2760 | 3.3% |
| Other values (156) | 22974 | 27.2% |
Words
| Value | Count | Frequency (%) |
| r4 | 12989 | 15.4% |
| d4 | 12666 | 15.0% |
| a1 | 6751 | 8.0% |
| a5 | 5671 | 6.7% |
| b2 | 4918 | 5.8% |
| b1 | 4747 | 5.6% |
| c0 | 4384 | 5.2% |
| b3 | 3821 | 4.5% |
| a2 | 2867 | 3.4% |
| c6 | 2760 | 3.3% |
| Other values (156) | 22974 | 27.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 26664 | 15.8% |
| R | 21018 | 12.4% |
| A | 17875 | 10.6% |
| B | 15508 | 9.2% |
| 1 | 15445 | 9.1% |
| D | 13284 | 7.9% |
| 2 | 10784 | 6.4% |
| C | 10617 | 6.3% |
| 3 | 7129 | 4.2% |
| 0 | 6488 | 3.8% |
| Other values (25) | 24284 | 14.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 87366 | 51.7% |
| Decimal Number | 81730 | 48.3% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 21018 | 24.1% |
| A | 17875 | 20.5% |
| B | 15508 | 17.8% |
| D | 13284 | 15.2% |
| C | 10617 | 12.2% |
| S | 2221 | 2.5% |
| G | 1873 | 2.1% |
| V | 1711 | 2.0% |
| K | 1089 | 1.2% |
| P | 353 | 0.4% |
| Other values (15) | 1817 | 2.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 26664 | 32.6% |
| 1 | 15445 | 18.9% |
| 2 | 10784 | 13.2% |
| 3 | 7129 | 8.7% |
| 0 | 6488 | 7.9% |
| 5 | 6242 | 7.6% |
| 9 | 4827 | 5.9% |
| 6 | 3198 | 3.9% |
| 7 | 767 | 0.9% |
| 8 | 186 | 0.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 87366 | 51.7% |
| Common | 81730 | 48.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| R | 21018 | 24.1% |
| A | 17875 | 20.5% |
| B | 15508 | 17.8% |
| D | 13284 | 15.2% |
| C | 10617 | 12.2% |
| S | 2221 | 2.5% |
| G | 1873 | 2.1% |
| V | 1711 | 2.0% |
| K | 1089 | 1.2% |
| P | 353 | 0.4% |
| Other values (15) | 1817 | 2.1% |
Common
| Value | Count | Frequency (%) |
| 4 | 26664 | 32.6% |
| 1 | 15445 | 18.9% |
| 2 | 10784 | 13.2% |
| 3 | 7129 | 8.7% |
| 0 | 6488 | 7.9% |
| 5 | 6242 | 7.6% |
| 9 | 4827 | 5.9% |
| 6 | 3198 | 3.9% |
| 7 | 767 | 0.9% |
| 8 | 186 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 169096 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4 | 26664 | 15.8% |
| R | 21018 | 12.4% |
| A | 17875 | 10.6% |
| B | 15508 | 9.2% |
| 1 | 15445 | 9.1% |
| D | 13284 | 7.9% |
| 2 | 10784 | 6.4% |
| C | 10617 | 6.3% |
| 3 | 7129 | 4.2% |
| 0 | 6488 | 3.8% |
| Other values (25) | 24284 | 14.4% |
SALE PRICE
Categorical
| Distinct | 10008 |
|---|---|
| Distinct (%) | 11.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| - | 14561 |
|---|---|
| 0 | 10228 |
| 10 | 766 |
| 450000 | 427 |
| 550000 | 416 |
| Other values (10003) | 58150 |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 5.1760302 |
| Min length | 1 |
Characters and Unicode
| Total characters | 437623 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 6916 |
|---|---|
| Unique (%) | 8.2% |
Sample
| 1st row | 6625000 |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | 3936272 |
| 5th row | 8000000 |
Common Values
| Value | Count | Frequency (%) |
| - | 14561 | 17.2% |
| 0 | 10228 | 12.1% |
| 10 | 766 | 0.9% |
| 450000 | 427 | 0.5% |
| 550000 | 416 | 0.5% |
| 650000 | 414 | 0.5% |
| 600000 | 409 | 0.5% |
| 700000 | 382 | 0.5% |
| 400000 | 378 | 0.4% |
| 750000 | 377 | 0.4% |
| Other values (9998) | 56190 | 66.5% |
Words
| Value | Count | Frequency (%) |
| - | 14561 | 17.2% |
| 0 | 10228 | 12.1% |
| 10 | 766 | 0.9% |
| 450000 | 427 | 0.5% |
| 550000 | 416 | 0.5% |
| 650000 | 414 | 0.5% |
| 600000 | 409 | 0.5% |
| 700000 | 382 | 0.5% |
| 400000 | 378 | 0.4% |
| 750000 | 377 | 0.4% |
| Other values (9998) | 56190 | 66.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 201287 | 46.0% |
| (space) | 43683 | 10.0% |
| 5 | 35469 | 8.1% |
| 1 | 24639 | 5.6% |
| 2 | 20911 | 4.8% |
| 3 | 17749 | 4.1% |
| 4 | 16596 | 3.8% |
| 7 | 16167 | 3.7% |
| 9 | 16072 | 3.7% |
| 6 | 15734 | 3.6% |
| Other values (2) | 29316 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 379379 | 86.7% |
| Space Separator | 43683 | 10.0% |
| Dash Punctuation | 14561 | 3.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 201287 | 53.1% |
| 5 | 35469 | 9.3% |
| 1 | 24639 | 6.5% |
| 2 | 20911 | 5.5% |
| 3 | 17749 | 4.7% |
| 4 | 16596 | 4.4% |
| 7 | 16167 | 4.3% |
| 9 | 16072 | 4.2% |
| 6 | 15734 | 4.1% |
| 8 | 14755 | 3.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 43683 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 14561 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 437623 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 201287 | 46.0% |
| (space) | 43683 | 10.0% |
| 5 | 35469 | 8.1% |
| 1 | 24639 | 5.6% |
| 2 | 20911 | 4.8% |
| 3 | 17749 | 4.1% |
| 4 | 16596 | 3.8% |
| 7 | 16167 | 3.7% |
| 9 | 16072 | 3.7% |
| 6 | 15734 | 3.6% |
| Other values (2) | 29316 | 6.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 437623 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 201287 | 46.0% |
| (space) | 43683 | 10.0% |
| 5 | 35469 | 8.1% |
| 1 | 24639 | 5.6% |
| 2 | 20911 | 4.8% |
| 3 | 17749 | 4.1% |
| 4 | 16596 | 3.8% |
| 7 | 16167 | 3.7% |
| 9 | 16072 | 3.7% |
| 6 | 15734 | 3.6% |
| Other values (2) | 29316 | 6.7% |
SALE DATE
Categorical
| Distinct | 364 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 660.7 KiB |
| 2017-06-29 00:00:00 | 544 |
|---|---|
| 2017-06-15 00:00:00 | 530 |
| 2016-12-22 00:00:00 | 527 |
| 2017-05-25 00:00:00 | 511 |
| 2016-10-06 00:00:00 | 508 |
| Other values (359) | 81928 |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Characters and Unicode
| Total characters | 1606412 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 2017-07-19 00:00:00 |
|---|---|
| 2nd row | 2016-12-14 00:00:00 |
| 3rd row | 2016-12-09 00:00:00 |
| 4th row | 2016-09-23 00:00:00 |
| 5th row | 2016-11-17 00:00:00 |
Common Values
| Value | Count | Frequency (%) |
| 2017-06-29 00:00:00 | 544 | 0.6% |
| 2017-06-15 00:00:00 | 530 | 0.6% |
| 2016-12-22 00:00:00 | 527 | 0.6% |
| 2017-05-25 00:00:00 | 511 | 0.6% |
| 2016-10-06 00:00:00 | 508 | 0.6% |
| 2017-06-30 00:00:00 | 493 | 0.6% |
| 2017-03-30 00:00:00 | 493 | 0.6% |
| 2016-10-28 00:00:00 | 493 | 0.6% |
| 2016-09-22 00:00:00 | 489 | 0.6% |
| 2016-09-29 00:00:00 | 474 | 0.6% |
| Other values (354) | 79486 | 94.0% |
Words
| Value | Count | Frequency (%) |
| 00:00:00 | 84548 | 50.0% |
| 2017-06-29 | 544 | 0.3% |
| 2017-06-15 | 530 | 0.3% |
| 2016-12-22 | 527 | 0.3% |
| 2017-05-25 | 511 | 0.3% |
| 2016-10-06 | 508 | 0.3% |
| 2017-06-30 | 493 | 0.3% |
| 2017-03-30 | 493 | 0.3% |
| 2016-10-28 | 493 | 0.3% |
| 2016-09-22 | 489 | 0.3% |
| Other values (355) | 79960 | 47.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 693670 | 43.2% |
| - | 169096 | 10.5% |
| : | 169096 | 10.5% |
| 1 | 157522 | 9.8% |
| 2 | 135506 | 8.4% |
| (space) | 84548 | 5.3% |
| 7 | 71062 | 4.4% |
| 6 | 46100 | 2.9% |
| 3 | 21154 | 1.3% |
| 9 | 15627 | 1.0% |
| Other values (3) | 43031 | 2.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1183672 | 73.7% |
| Dash Punctuation | 169096 | 10.5% |
| Other Punctuation | 169096 | 10.5% |
| Space Separator | 84548 | 5.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 693670 | 58.6% |
| 1 | 157522 | 13.3% |
| 2 | 135506 | 11.4% |
| 7 | 71062 | 6.0% |
| 6 | 46100 | 3.9% |
| 3 | 21154 | 1.8% |
| 9 | 15627 | 1.3% |
| 5 | 14766 | 1.2% |
| 8 | 14473 | 1.2% |
| 4 | 13792 | 1.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 169096 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 169096 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 84548 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1606412 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 693670 | 43.2% |
| - | 169096 | 10.5% |
| : | 169096 | 10.5% |
| 1 | 157522 | 9.8% |
| 2 | 135506 | 8.4% |
| (space) | 84548 | 5.3% |
| 7 | 71062 | 4.4% |
| 6 | 46100 | 2.9% |
| 3 | 21154 | 1.3% |
| 9 | 15627 | 1.0% |
| Other values (3) | 43031 | 2.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1606412 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 693670 | 43.2% |
| - | 169096 | 10.5% |
| : | 169096 | 10.5% |
| 1 | 157522 | 9.8% |
| 2 | 135506 | 8.4% |
| (space) | 84548 | 5.3% |
| 7 | 71062 | 4.4% |
| 6 | 46100 | 2.9% |
| 3 | 21154 | 1.3% |
| 9 | 15627 | 1.0% |
| Other values (3) | 43031 | 2.7% |
Auto
The auto setting selects an interpretable pairwise association measure for each pair of columns, according to the following mapping (Variable type-Variable type: Method, Range):
- Categorical-Categorical: Cramér's V, [0, 1]
- Numerical-Categorical: Cramér's V, [0, 1] (using a discretized numerical column)
- Numerical-Numerical: Spearman's ρ, [-1, 1]
This configuration uses the recommended metric for each pair of columns.
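The Numerical-Categorical row of the mapping can be illustrated with a small sketch. The code below bins a numeric column into equal-width intervals and then computes a plain Cramér's V against a categorical column. The helper names `discretize` and `cramers_v`, the choice of two bins, and the omission of any bias correction are assumptions made for illustration; the report's own implementation may discretize and correct differently. The ten values come from the YEAR BUILT and TAX CLASS AT TIME OF SALE columns of the sample rows in this report.

```python
from collections import Counter
from math import sqrt


def discretize(values, bins=2):
    """Map numeric values to equal-width bin indices 0..bins-1 (illustrative choice)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1  # fall back to 1 when all values are equal
    return [min(int((v - lo) / width), bins - 1) for v in values]


def cramers_v(a, b):
    """Plain (not bias-corrected) Cramér's V between two categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    chi2 = 0.0
    for r, n_r in row_totals.items():
        for c, n_c in col_totals.items():
            expected = n_r * n_c / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Five first rows and five last rows of the sample tables
year_built = [1900, 1900, 1900, 1913, 1900, 1995, 1995, 1995, 1994, 1997]
tax_class = ["2", "2", "2", "2", "2", "1", "1", "1", "1", "1"]

v = cramers_v(discretize(year_built, bins=2), tax_class)
print(v)
```

On these ten rows the binned year separates the two tax classes completely, so the statistic reaches its upper bound; on the full dataset the value would of course differ.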
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
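As a minimal sketch of that calculation, the code below ranks both variables (ties receive their average rank) and then divides the covariance of the ranks by the product of their standard deviations. The function names and the toy data are assumptions for illustration, not the report's implementation.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def _pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return _pearson(average_ranks(x), average_ranks(y))


# Made-up monotonic but nonlinear relationship: y = x cubed
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = _pearson(x, y)
print(rho, r)
```

Because the relationship is strictly increasing, the ranks of `x` and `y` coincide and ρ reaches +1, while Pearson's r on the raw values stays below 1.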
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not change the magnitude of r; only the sign of the slope determines the sign of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
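A short sketch of this formula follows, using the RESIDENTIAL UNITS and TOTAL UNITS values from the first five sample rows of this report. The function name `pearson_r` and the rescaling constants are assumptions for illustration; the second call only demonstrates the invariance under a change of location and positive scale.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors of covariance and variances cancel, so they are omitted.
    return cov / sqrt(var_x * var_y)


residential_units = [5, 28, 16, 10, 6]
total_units = [5, 31, 17, 10, 6]

r = pearson_r(residential_units, total_units)

# Shift and rescale one variable: r should be unchanged up to rounding error.
r_rescaled = pearson_r([2 * v + 7 for v in residential_units], total_units)
print(r, r_rescaled)
```

On these five rows r is close to 1, which is consistent with the high-correlation alert the report raises for this pair of columns.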
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
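The pair-counting definition above can be sketched directly. This version is the simple form described in the text (often called τ-a) and treats tied pairs as neither concordant nor discordant; library implementations commonly use tie-adjusted variants, so results can differ when ties are present. The function name and toy data are assumptions for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Made-up data: one swapped pair out of six possible pairs
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, giving τ = (5 - 1) / 6.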
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available for this coefficient.
First rows
| | Unnamed: 0 | BOROUGH | NEIGHBORHOOD | BUILDING CLASS CATEGORY | TAX CLASS AT PRESENT | BLOCK | LOT | EASE-MENT | BUILDING CLASS AT PRESENT | ADDRESS | APARTMENT NUMBER | ZIP CODE | RESIDENTIAL UNITS | COMMERCIAL UNITS | TOTAL UNITS | LAND SQUARE FEET | GROSS SQUARE FEET | YEAR BUILT | TAX CLASS AT TIME OF SALE | BUILDING CLASS AT TIME OF SALE | SALE PRICE | SALE DATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2A | 392 | 6 | | C2 | 153 AVENUE B | | 10009 | 5 | 0 | 5 | 1633 | 6440 | 1900 | 2 | C2 | 6625000 | 2017-07-19 00:00:00 |
| 1 | 5 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2 | 399 | 26 | | C7 | 234 EAST 4TH STREET | | 10009 | 28 | 3 | 31 | 4616 | 18690 | 1900 | 2 | C7 | - | 2016-12-14 00:00:00 |
| 2 | 6 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2 | 399 | 39 | | C7 | 197 EAST 3RD STREET | | 10009 | 16 | 1 | 17 | 2212 | 7803 | 1900 | 2 | C7 | - | 2016-12-09 00:00:00 |
| 3 | 7 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2B | 402 | 21 | | C4 | 154 EAST 7TH STREET | | 10009 | 10 | 0 | 10 | 2272 | 6794 | 1913 | 2 | C4 | 3936272 | 2016-09-23 00:00:00 |
| 4 | 8 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2A | 404 | 55 | | C2 | 301 EAST 10TH STREET | | 10009 | 6 | 0 | 6 | 2369 | 4615 | 1900 | 2 | C2 | 8000000 | 2016-11-17 00:00:00 |
| 5 | 9 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2 | 405 | 16 | | C4 | 516 EAST 12TH STREET | | 10009 | 20 | 0 | 20 | 2581 | 9730 | 1900 | 2 | C4 | - | 2017-07-20 00:00:00 |
| 6 | 10 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2B | 406 | 32 | | C4 | 210 AVENUE B | | 10009 | 8 | 0 | 8 | 1750 | 4226 | 1920 | 2 | C4 | 3192840 | 2016-09-23 00:00:00 |
| 7 | 11 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2 | 407 | 18 | | C7 | 520 EAST 14TH STREET | | 10009 | 44 | 2 | 46 | 5163 | 21007 | 1900 | 2 | C7 | - | 2017-07-20 00:00:00 |
| 8 | 12 | 1 | ALPHABET CITY | 08 RENTALS - ELEVATOR APARTMENTS | 2 | 379 | 34 | | D5 | 141 AVENUE D | | 10009 | 15 | 0 | 15 | 1534 | 9198 | 1920 | 2 | D5 | - | 2017-06-20 00:00:00 |
| 9 | 13 | 1 | ALPHABET CITY | 08 RENTALS - ELEVATOR APARTMENTS | 2 | 387 | 153 | | D9 | 629 EAST 5TH STREET | | 10009 | 24 | 0 | 24 | 4489 | 18523 | 1920 | 2 | D9 | 16232000 | 2016-11-07 00:00:00 |
Last rows
| | Unnamed: 0 | BOROUGH | NEIGHBORHOOD | BUILDING CLASS CATEGORY | TAX CLASS AT PRESENT | BLOCK | LOT | EASE-MENT | BUILDING CLASS AT PRESENT | ADDRESS | APARTMENT NUMBER | ZIP CODE | RESIDENTIAL UNITS | COMMERCIAL UNITS | TOTAL UNITS | LAND SQUARE FEET | GROSS SQUARE FEET | YEAR BUILT | TAX CLASS AT TIME OF SALE | BUILDING CLASS AT TIME OF SALE | SALE PRICE | SALE DATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 84538 | 8404 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7316 | 61 | | B2 | 178 DARNELL LANE | | 10309 | 2 | 0 | 2 | 3215 | 1300 | 1995 | 1 | B2 | - | 2017-06-30 00:00:00 |
| 84539 | 8405 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7316 | 85 | | B2 | 137 DARNELL LANE | | 10309 | 2 | 0 | 2 | 3016 | 1300 | 1995 | 1 | B2 | - | 2016-12-30 00:00:00 |
| 84540 | 8406 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7316 | 93 | | B2 | 125 DARNELL LANE | | 10309 | 2 | 0 | 2 | 3325 | 1300 | 1995 | 1 | B2 | 509000 | 2016-10-31 00:00:00 |
| 84541 | 8407 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7317 | 126 | | B2 | 112 ROBIN COURT | | 10309 | 2 | 0 | 2 | 11088 | 2160 | 1994 | 1 | B2 | 648000 | 2016-12-07 00:00:00 |
| 84542 | 8408 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7339 | 41 | | B9 | 41 SONIA COURT | | 10309 | 2 | 0 | 2 | 3020 | 1800 | 1997 | 1 | B9 | - | 2016-12-01 00:00:00 |
| 84543 | 8409 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7349 | 34 | | B9 | 37 QUAIL LANE | | 10309 | 2 | 0 | 2 | 2400 | 2575 | 1998 | 1 | B9 | 450000 | 2016-11-28 00:00:00 |
| 84544 | 8410 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7349 | 78 | | B9 | 32 PHEASANT LANE | | 10309 | 2 | 0 | 2 | 2498 | 2377 | 1998 | 1 | B9 | 550000 | 2017-04-21 00:00:00 |
| 84545 | 8411 | 5 | WOODROW | 02 TWO FAMILY DWELLINGS | 1 | 7351 | 60 | | B2 | 49 PITNEY AVENUE | | 10309 | 2 | 0 | 2 | 4000 | 1496 | 1925 | 1 | B2 | 460000 | 2017-07-05 00:00:00 |
| 84546 | 8412 | 5 | WOODROW | 22 STORE BUILDINGS | 4 | 7100 | 28 | | K6 | 2730 ARTHUR KILL ROAD | | 10309 | 0 | 7 | 7 | 208033 | 64117 | 2001 | 4 | K6 | 11693337 | 2016-12-21 00:00:00 |
| 84547 | 8413 | 5 | WOODROW | 35 INDOOR PUBLIC AND CULTURAL FACILITIES | 4 | 7105 | 679 | | P9 | 155 CLAY PIT ROAD | | 10309 | 0 | 1 | 1 | 10796 | 2400 | 2006 | 4 | P9 | 69300 | 2016-10-27 00:00:00 |